Skip to content

Conversation

@katrinafyi
Copy link
Contributor

this more clearly documents the behaviour of these flags. this makes no changes to behavior. as always, this documents the usage as i observe right now and does not presume that this behaviour is good or correct.

i also hope to clarify some more subtle points:

  • the base-url is uniformly applied to every local file being checked, regardless of the filesystem path of each local file. this means that a relative link a/b resolves to exactly the same URL no matter which local file it appears in.
  • base-url is affected by trailing slashes.
  • define behavior when root-dir and base-url are both specified. namely, root-dir becomes a subpath of the base-url's domain.

honestly, the more i think about it, the more i think this behavior is wrong or has diverged from its original intention. for instance, let's look at the docs recipe about --root-dir:

Sometimes you need both:

lychee \
  --root-dir "$(pwd)/public" \
  --base-url https://example.com/ \
  "public/**/*.html"

This tells lychee:

  1. Look for /-prefixed files in ./public/
  2. Resolve relative links against https://example.com/

with the above points in mind, this example is nonsensical. firstly, the root-dir will make absolute links become something like https://example.com/path/to/pwd/public/absolute-link which is unlikely to exist on the remote host. this is in contradiction with point (1) in the quote. secondly, /every/ relative link is resolved against exactly https://example.com/, without considering its position relative to root-dir. this seems wrong too.

but in any case, this PR clarifies the behavior as it is right now. this PR is mostly just my findings from #1718 (comment)

this more clearly documents the behaviour of these flags. this makes no
changes to behavior. as always, this documents the usage as i observe
right now and does not presume that this behaviour is good or correct.

i also hope to clarify some more subtle points:
- the base-url is uniformly applied to every local file being checked,
  regardless of the filesystem path of each local file. this means that
  a relative link `a/b` resolves to exactly the same URL no matter which
  local file it appears in.
- base-url is affected by trailing slashes.
- define behavior when root-dir and base-url are both specified. namely,
  root-dir becomes a subpath of the base-url's domain.

honestly, the more i think about it, the more i think this behavior is
wrong or has diverged from its original intention. for instance, let's
look at [the docs recipe about `--root-dir`][1]:

[1]: https://lychee.cli.rs/recipes/root-dir/#using-both-together

> Sometimes you need both:
>
> ```
> lychee \
>   --root-dir "$(pwd)/public" \
>   --base-url https://example.com/ \
>   "public/**/*.html"
> ```
>
> This tells lychee:
>
> 1. Look for /-prefixed files in ./public/
> 2. Resolve relative links against https://example.com/

with the above points in mind, this example is nonsensical. firstly, the
root-dir will make absolute links become something like
`https://example.com/path/to/pwd/public/absolute-link` which is unlikely
to exist on the remote host. this is in contradiction with point (1) in
the quote. secondly, /every/ relative link is resolved against exactly
`https://example.com/`, without considering its position relative to
root-dir. this seems wrong too.

but in any case, this PR clarifies the behavior as it is right now.
/// Root path to use when checking absolute local links,
/// must be an absolute path
#[arg(long)]
/// Root path to use when checking absolute links in local files.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use the opportunity to make this change as well?

Suggested change
/// Root path to use when checking absolute links in local files.
/// Root directory to use when checking absolute links in local files.

(Same for long_help.)

Comment on lines 670 to 671
absolute link is resolved as: the domain name of the base URL, then the root
dir path, then the absolute link's path."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That confused me a bit, mostly because the two parameters get confusing when used in combination. Is there an easier way to explain that? I can't come up with a good alternative right now, but maybe you have an idea. I liked your examples in the help message, so maybe another example would help or an ordered list of steps?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the two parameters get confusing when used in combination

You can say that again #1718 😄

Copy link
Contributor Author

@katrinafyi katrinafyi Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an easier way to explain that?

I'm not sure. as some background, in #1718, I wrote

<a href="relative/1"> --> {parent(base-url)}/relative/1
<a href="/absolute/1"> --> {origin(base-url)}{root-dir}/absolute/1

which I think is pretty clear. what you see in this PR is an attempt to translate that into prose.

it does lose some formality, but i worry that if we make it too explicit, it starts becoming more of a formal specification. this is not bad (indeed, it's good to be precise about these things) but I think help text isn't the right place for such formality. I want to make sure it doesn't get too overwhelming.

but anyway, maybe we can make it clearer with some phrasing changes:

Suggested change
absolute link is resolved as: the domain name of the base URL, then the root
dir path, then the absolute link's path."
absolute link is resolved to: the
domain name of --base-url, joined
with --root-dir, then joined with the
absolute link's path."

what're your thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I like that much better. What do you think of this version?

Suggested change
absolute link is resolved as: the domain name of the base URL, then the root
dir path, then the absolute link's path."
absolute links are resolved by constructing a URL from three parts:
the domain name specified in `--base-url`, followed by the `--root-dir`
directory path, followed by the absolute link's own path."

@mre
Copy link
Member

mre commented Aug 14, 2025

It is amazing that we start to properly document these options.
Added some comments, but your description is already much better than anything I ever came up with.

@mre mre merged commit 71204bd into lycheeverse:master Aug 28, 2025
6 checks passed
@mre
Copy link
Member

mre commented Aug 28, 2025

Not sure why we kept this open for so long, sorry. It was ready to be merged earlier. 😅 Anyhow, it's merged now.

This was referenced Aug 28, 2025
@katrinafyi
Copy link
Contributor Author

No worries, thanks! I think i just forgot to reply that I'd fixed the review comments :)

This was referenced Oct 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants